Abstract: This paper considers the problem of error correction 
for a cooperative data exchange (CDE) system, where some clients 
are compromised or failed and send false messages. Assuming 
each client possesses a subset of the total messages, we analyze 
the error correction capability when every client is allowed to 
broadcast only one linearly-coded message. Our error correction 
capability bound determines the maximum number of clients 
that can be compromised or failed without jeopardizing the final 
decoding solution at each client. We show that deterministic, 
feasible linear codes exist that can achieve the derived bound. We 
also evaluate random linear codes, where the coding coefficients 
are drawn randomly, and derive the probability that a client 
withstands a certain number of compromised or failed peers 
and successfully deduces the complete message, for any network 
size and any initial message distribution. 

Index Terms — cooperative data exchange, error correction, 
error detection, network coding, security. 

I. Introduction 

The fundamental challenge of networks is to pull together 
all the available network resources and to arrange all the clients 
in efficient cooperation, such that they can collaboratively 
deliver a quality and trustworthy service. Cooperative data 
exchange (CDE) [1] among the clients has become a promising 
approach for achieving efficient data communications. In a 
CDE system, each client initially holds only a subset of the 
packets and seeks all the packets (from its peers). 
It is typical to assume that the clients communicate through 
(wireless) broadcast channels. The objective is to design a 
network coded transmission scheme that minimizes the total 
number of transmissions [1]-[3] or the total transmission cost 
[4]-[6] and, at the same time, ensures that all the clients can deduce 
the complete information. 

Most existing studies on the CDE problem assume 
that the transmission from every client is reliable and trustworthy. 
In practice, however, there may be compromised 
clients who intentionally send false messages¹, or failed clients 
who send wrong readings. Such behavior could cause decoding errors 
or failures, which motivates us to explore error 
correction for the CDE problem. One example is 
a sensor network: when one sensor fails, how can we detect 
and correct the error through the readings of the other sensors? 

W. Song is with the School of Mathematical Sciences, Peking University, 
China, and with Singapore University of Technology and Design, Singapore. 
E-mail: songwentu@gmail.com. 

X. Wang was with Singapore University of Technology and Design, 
Singapore, and with School of Computer and Information, Hefei University 
of Technology, Hefei 230009, China. Email: wxiumin@hfut.edu.cn. 

C. Yuen is with Singapore University of Technology and Design, Singapore. 
Email: yuenchau@sutd.edu.sg. 

T. J. Li is with the Department of Electrical and Computer Engineering, 
Lehigh University, Bethlehem, PA 18015, USA. Email: jingli@ece.lehigh.edu. 

R. Feng is with the School of Mathematical Sciences, Peking University, 
China. Email: fengrq@math.pku.edu.cn. 

This research is partly supported by the International Design Center (grant 
no. IDG31100102 and IDD11100101). 

¹For simplicity, we consider error-free transmission. A transmission error 
may be treated as an error-free transmission from a compromised client. 

In the literature, several interesting works [7], [8] have 
studied the problem of network coding based error correction 
[9], [10], but these works cannot be applied to CDE. This 
is because the existing studies assume that there exists a 
single source node (sender) in possession of all the packets, 
whereas in the CDE problem, there exist multiple source nodes 
(senders), each equipped with only a subset of the packets. 

Assuming that each client initially holds a subset of the 
messages, we investigate the error correction capability that 
a "fair and once" cooperative data exchange scheme can 
achieve, where "fair and once" means each client is allowed 
to broadcast exactly one packet. We say a CDE transmission 
scheme is a $\delta$-error correction solution if it guarantees the 
correct recovery of the complete messages by all the clients 
in the presence of up to $\delta$ compromised clients. The contributions 
of this paper include: 

• Given the initial message distribution, we derive the error 
correction capability for a linear-coded CDE problem, 
which specifies the maximum number of compromised 
clients the system can tolerate without jeopardizing the ultimate 
integrity and accuracy of the message at each client. 

• We show that deterministic, feasible linear code designs 
exist to achieve the derived error correction capability. 

• Since deterministic coding schemes are inflexible and 
unscalable, we also investigate the case of random linear 
network coding. We derive the ensemble average proba- 
bility for any client to correctly deduce all the messages 
despite the existence of certain compromised peers. 

The rest of this paper is organized as follows. Sec. II 
formulates the problem. Sec. III develops error correction for a 
general CDE problem. We discuss the performance of random 
network coding in Sec. IV and conclude the paper in Sec. V. 

II. Problem Definition and Signal Model 

Consider a set of $k$ packets $X = \{x_1, x_2, \cdots, x_k\}$ to be 
delivered to $n$ clients in $R = \{r_1, r_2, \cdots, r_n\}$, where each 
message $x_i$ is assumed to be an element of a finite field 
$\mathbb{F}$. Suppose that initially, client $r_j \in R$ holds a subset of 
packets $\{x_i\}_{i \in A_j}$, and the clients collectively have all the 
packets in $X$, i.e., $\bigcup_{r_j \in R} A_j = \{1, \cdots, k\}$. To simplify the 
presentation, we use $\bar{A}_j$ to denote the index set of the missing 
packets of client $r_j$, i.e., $\bar{A}_j = \{1, \cdots, k\} \setminus A_j$, and use 
$|\bar{A}_j|$ to denote the size of $\bar{A}_j$. Following the system model 
in [1], the clients exchange packets over a common 
broadcast channel to assist each other in correctly obtaining all 
of their missing packets. This problem, hereafter referred to 
as the cooperative data exchange problem, is denoted by the 
quadruple $\mathcal{H} = (k, n, X, \mathcal{X})$, where $\mathcal{X} = \{A_1, \cdots, A_n\}$. 
We assume that each client is permitted to use the common 
broadcast channel exactly once. There are $n$ clients and they 
take turns to broadcast. In the $j$th round, client $r_j$ broadcasts 
an encoded packet $y_j$ which is an $\mathbb{F}$-linear combination 
of the packets it initially has, i.e., $y_j = \sum_{i=1}^{k} a_{i,j} x_i$, where 
$a_{i,j} \in \mathbb{F}$ and $a_{i,j} = 0$ if $i \in \bar{A}_j$. The matrix $(a_{i,j})_{k \times n}$ specifies 
a transmission scheme for the CDE problem $\mathcal{H} = (k, n, X, \mathcal{X})$ 
and is called an encoding matrix of this problem. 

We define the error correction problem for a general CDE 
problem $\mathcal{H} = (k, n, X, \mathcal{X})$ as follows: 

Definition 1 The $\delta$-error correction problem for the CDE 
problem $\mathcal{H} = (k, n, X, \mathcal{X})$ is to find a transmission scheme 
such that each client $r_j$ can correctly recover all the packets 
in $X$, so long as there are no more than $\delta$ compromised clients. 

Definition 2 The incidence matrix of $\mathcal{H} = (k, n, X, \mathcal{X})$ is 
defined as the matrix $C = (\xi_{i,j})_{k \times n}$, where $\xi_{i,j}$ is a variable 
if $i \in A_j$, and $\xi_{i,j} = 0$ otherwise. The local incidence matrix 
of $r_j$, denoted by $C_j$, is defined as the sub-matrix of $C$ which 
only includes the row vectors with indices in $\bar{A}_j$. 

Remark 1 Clearly, an arbitrary encoding matrix is obtained 
by assigning a value in $\mathbb{F}$ to each $\xi_{i,j}$ in the incidence matrix, 
where $\mathbb{F}$ is the support field of encoding and decoding. 

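Example 1: Consider a CDE problem in which there are six 
messages $x_1, \cdots, x_6$ and six clients $r_1, \cdots, r_6$, where each 
message $x_i$ is an element of the ternary field $\mathbb{F}_3 = \{0, 1, 2\}$. 
Suppose initially, client $r_i$ knows a subset $A_i$ of the 
messages, where $A_1 = \{1,3,6\}$, $A_2 = \{2,3,4\}$, $A_3 = \{1,2,5\}$, 
$A_4 = \{3,4,5\}$, $A_5 = \{2,4,6\}$ and $A_6 = \{1,5,6\}$. 
By Definition 2, the incidence matrix is 

$$C = \begin{pmatrix}
\xi_{1,1} & 0 & \xi_{1,3} & 0 & 0 & \xi_{1,6} \\
0 & \xi_{2,2} & \xi_{2,3} & 0 & \xi_{2,5} & 0 \\
\xi_{3,1} & \xi_{3,2} & 0 & \xi_{3,4} & 0 & 0 \\
0 & \xi_{4,2} & 0 & \xi_{4,4} & \xi_{4,5} & 0 \\
0 & 0 & \xi_{5,3} & \xi_{5,4} & 0 & \xi_{5,6} \\
\xi_{6,1} & 0 & 0 & 0 & \xi_{6,5} & \xi_{6,6}
\end{pmatrix}.$$

Since $\bar{A}_1 = \{2, 4, 5\}$, the local incidence matrix of client $r_1$ is 

$$C_1 = \begin{pmatrix}
0 & \xi_{2,2} & \xi_{2,3} & 0 & \xi_{2,5} & 0 \\
0 & \xi_{4,2} & 0 & \xi_{4,4} & \xi_{4,5} & 0 \\
0 & 0 & \xi_{5,3} & \xi_{5,4} & 0 & \xi_{5,6}
\end{pmatrix}.$$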
Note that the elements in the $j$th column of the local incidence 
matrix of client $r_j$ are all zero. This is because the $j$th column 
vector denotes the encoding vector of the packet sent by $r_j$ itself, 
and the packets in $\{x_i\}_{i \in \bar{A}_j}$ are unknown to $r_j$. Based on the 
above definition, we further define the following matrix. 
Definition 3 Let $E = (a_{i,j})_{k \times n}$ be an encoding matrix of the 
CDE problem $\mathcal{H} = (k, n, X, \mathcal{X})$. The local receiving matrix 
of $r_j$, denoted by $E_j$, is defined as the sub-matrix of $E$ which includes the 
row vectors of $E$ with indices in $\bar{A}_j$. 
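
The zero pattern prescribed by Definitions 2 and 3 can be generated mechanically from the index sets. The following is a minimal Python sketch for the index sets of Example 1; the helper names (`incidence_support`, `missing`, `local_rows`) are illustrative assumptions and not part of the paper. A 1 marks a free variable and a 0 marks a forced zero.

```python
# Index sets A_j of Example 1 (1-based packet indices held by client r_j).
HELD = {1: {1, 3, 6}, 2: {2, 3, 4}, 3: {1, 2, 5},
        4: {3, 4, 5}, 5: {2, 4, 6}, 6: {1, 5, 6}}
K = 6  # number of packets


def incidence_support(held, k):
    """k x n 0/1 matrix: 1 where the incidence matrix has a variable, 0 elsewhere."""
    clients = sorted(held)
    return [[1 if i in held[j] else 0 for j in clients]
            for i in range(1, k + 1)]


def missing(held, k, j):
    """Sorted indices of the packets that client r_j does not hold."""
    return sorted(set(range(1, k + 1)) - held[j])


def local_rows(matrix, held, k, j):
    """Rows of `matrix` indexed by the missing packets of r_j."""
    return [matrix[i - 1] for i in missing(held, k, j)]


C = incidence_support(HELD, K)
C1 = local_rows(C, HELD, K, 1)  # support of the local incidence matrix of r_1
for row in C1:
    print(row)
```

The same `local_rows` helper applied to a numeric encoding matrix yields the local receiving matrix of Definition 3.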

Example 2 Consider the CDE problem in Example 1. The 
communication is completed in six rounds: in the $i$th round, 
client $r_i$ broadcasts $y_i$ to all other clients, where $y_1 = x_1 + x_3 + x_6$, 
$y_2 = x_2 + x_3 + x_4$, $y_3 = x_1 + x_2 + x_5$, $y_4 = x_3 + 2x_4 + x_5$, 
$y_5 = x_2 + 2x_4 + x_6$ and $y_6 = x_1 + 2x_5 + 2x_6$, 
which is specified by the following encoding matrix 

$$E = \begin{pmatrix}
1 & 0 & 1 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 2 & 2 & 0 \\
0 & 0 & 1 & 1 & 0 & 2 \\
1 & 0 & 0 & 0 & 1 & 2
\end{pmatrix}.$$

Client $r_1$ receives $y_2, y_3, y_4, y_5, y_6$. Since $r_1$ knows $x_1$, $x_3$ 
and $x_6$, it can compute $z_2 = y_2 - x_3 = x_2 + x_4$, 
$z_3 = y_3 - x_1 = x_2 + x_5$, $z_4 = y_4 - x_3 = 2x_4 + x_5$, 
$z_5 = y_5 - x_6 = x_2 + 2x_4$ and $z_6 = y_6 - x_1 - 2x_6 = 2x_5$. Thus $r_1$ has 
the equation $(0, z_2, z_3, z_4, z_5, z_6) = (x_2, x_4, x_5) E_1$ and can 
uniquely solve for $x_2, x_4, x_5$ from $z_2, z_3, z_4, z_5, z_6$, 
where $E_1$, the local receiving matrix of client $r_1$, is 

$$E_1 = \begin{pmatrix}
0 & 1 & 1 & 0 & 1 & 0 \\
0 & 1 & 0 & 2 & 2 & 0 \\
0 & 0 & 1 & 1 & 0 & 2
\end{pmatrix}.$$
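
The error-free exchange of Example 2 can be replayed numerically. The sketch below is an illustration under stated assumptions: the matrix `E` is read off from the expressions for $y_1, \cdots, y_6$ above, the packet values in `x_true` are hypothetical, and decoding is done by exhaustive search over the three unknown symbols rather than by any method prescribed in the paper.

```python
from itertools import product

P = 3  # arithmetic is modulo 3 (the ternary field of Example 1)

# Row i holds the coefficients of x_i, column j those of the broadcast y_j.
E = [
    [1, 0, 1, 0, 0, 1],
    [0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 1, 0, 2, 2, 0],
    [0, 0, 1, 1, 0, 2],
    [1, 0, 0, 0, 1, 2],
]


def encode(x, matrix, p=P):
    """Return Y = X * matrix over F_p, one broadcast symbol per column."""
    cols = len(matrix[0])
    return [sum(xi * row[j] for xi, row in zip(x, matrix)) % p
            for j in range(cols)]


def decode_error_free(y, known, matrix, p=P):
    """List every completion of `known` (0-based index -> value) that reproduces y."""
    k = len(matrix)
    unknown = [i for i in range(k) if i not in known]
    hits = []
    for guess in product(range(p), repeat=len(unknown)):
        x = [known.get(i, 0) for i in range(k)]
        for i, v in zip(unknown, guess):
            x[i] = v
        if encode(x, matrix, p) == y:
            hits.append(x)
    return hits


x_true = [1, 2, 0, 1, 2, 2]      # hypothetical packet values
y = encode(x_true, E)
known_r1 = {0: 1, 2: 0, 5: 2}    # r_1 holds x_1, x_3 and x_6
print(decode_error_free(y, known_r1, E))
```

Because the three rows of $E_1$ are linearly independent over $\mathbb{F}_3$, the search returns exactly one candidate when no broadcast is corrupted.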

III. Error Correction 

Given the initial information held by each client, in this 
section we first derive the error correction capability, 
$\delta$, for the fair-and-once CDE problem $\mathcal{H} = (k, n, X, \mathcal{X})$. 
We then demonstrate the tightness and 
achievability of $\delta$ by showing that feasible code designs exist. 

To simplify the presentation, we write the packet set $X = 
\{x_1, \cdots, x_k\}$ as a vector $X = (x_1, \cdots, x_k)$. For any vector 
$u$, we let $\mathrm{wt}(u)$ denote the Hamming weight of $u$, i.e., $\mathrm{wt}(u)$ 
is the number of non-zero components in $u$. If $C$ is a linear 
code of length $n$ and dimension $k$, then $C$ is referred to as an 
$[n, k]$ linear code. Moreover, if $C$ has minimum distance $d$, 
then $C$ is referred to as an $[n, k, d]$ linear code. 

For a CDE problem $\mathcal{H} = (k, n, X, \mathcal{X})$, we define the 
information space of $r_j$ as follows: 

Definition 4 The information space of $r_j$, denoted by $V_j$, is 
defined as the set of all possible values of $X$ as estimated by 
client $r_j$, i.e., $V_j = \{(x'_1, \cdots, x'_k) \in \mathbb{F}^k ;\ x'_i = x_i \text{ if } i \in A_j\}$. 

Clearly, $V_j$ contains altogether $|\mathbb{F}|^{|\bar{A}_j|}$ vectors, corresponding 
to all the exhaustive trial decoding solutions, yet only 
one is the true and correct message solution. For example, 
in Example 1 the information space of $r_1$ is $V_1 = 
\{(x_1, x'_2, x_3, x'_4, x'_5, x_6) ;\ x'_2, x'_4, x'_5 \in \mathbb{F}\}$. 

Since client $r_j$ knows the packets $x_i$, $\forall i \in A_j$, it must determine, 
from its received message vector $Y = (y_1, \cdots, y_n)$, 
a candidate vector $\hat{X} \in V_j$ as its decoder output. Given a 
received vector $Y \in \mathbb{F}^n$, the minimum distance decoder of $r_j$ 
is a map $D : \mathbb{F}^n \to V_j$ such that the decoder output $D(Y)$ 
satisfies $d_H(D(Y)E, Y) \le d_H(XE, Y)$ for any $X \in V_j$, 
where $d_H(\cdot, \cdot)$ is the Hamming distance function. 
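
A minimum distance decoder of this kind can be sketched by exhaustive search. The code below is a generic illustration, not the authors' implementation: it decodes with respect to an arbitrary generator matrix `G` over a prime field (for client $r_j$ this role is played by the local receiving matrix after the known packets are subtracted, as in Example 2). The small ternary matrix `G_DEMO` is a hypothetical code with minimum distance 3, chosen only to exercise the decoder.

```python
from itertools import product


def hamming(u, v):
    """Number of positions in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def encode(msg, G, p):
    """Codeword msg * G over F_p."""
    return [sum(m * row[j] for m, row in zip(msg, G)) % p
            for j in range(len(G[0]))]


def min_distance_decode(y, G, p):
    """Return (message, distance) of a codeword nearest to y; ties go to the first found."""
    best = None
    for msg in product(range(p), repeat=len(G)):
        dist = hamming(encode(msg, G, p), y)
        if best is None or dist < best[1]:
            best = (list(msg), dist)
    return best


# Hypothetical [4, 2] ternary code used only for demonstration.
G_DEMO = [[1, 0, 1, 1],
          [0, 1, 1, 2]]
sent = encode([1, 2], G_DEMO, 3)
received = [0] + sent[1:]          # corrupt the first symbol
print(min_distance_decode(received, G_DEMO, 3))
```

The search costs $|\mathbb{F}|^{|\bar{A}_j|}$ encodings per decode, so it is only practical for small instances.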
Lemma 1 A transmission scheme with the encoding matrix 
$E$ is a $\delta$-error correction solution of the CDE problem $\mathcal{H} = 
(k, n, X, \mathcal{X})$ if and only if each local receiving matrix $E_j$ is 
a generating matrix of an $[n, |\bar{A}_j|]$ linear code with minimum 
distance $d \ge 2\delta + 1$. 

Proof: From the theory of classical error-correcting codes 
[10], for any client $r_j \in R$, the transmission scheme can 
correct $\delta' \le \delta$ errors if and only if for any $X, X' \in V_j$ with 
$X \ne X'$, $d_H(XE, X'E) \ge 2\delta + 1$, or, equivalently, 

$$\mathrm{wt}(XE - X'E) \ge 2\delta + 1, \quad \forall \{X, X'\} \subseteq V_j. \tag{1}$$

Let $U_j = \{(x_1, \cdots, x_k) \in \mathbb{F}^k ;\ x_i = 0, \forall i \in A_j\} \setminus \{0_k\}$, 
where $0_k$ denotes the all-zero row vector of length $k$. Note that 
$XE - X'E = (X - X')E$. It is easy to see that $U_j = \{X - 
X' ;\ X, X' \in V_j \text{ and } X \ne X'\}$, so Eq. (1) is equivalent to 

$$\mathrm{wt}(XE) \ge 2\delta + 1, \quad \forall X \in U_j. \tag{2}$$

By the definition of $E_j$ and $U_j$, Eq. (2) is equivalent to 

$$\mathrm{wt}(XE_j) \ge 2\delta + 1, \quad \forall X \in \mathbb{F}^{|\bar{A}_j|} \setminus \{0_{|\bar{A}_j|}\}. \tag{3}$$

Eq. (3) means that $E_j$ is a generating matrix of an $[n, |\bar{A}_j|]$ 
linear code with minimum weight at least $2\delta + 1$. Since 
the minimum distance of a linear code equals its minimum 
weight, this proves Lemma 1. ■ 
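
The criterion of Lemma 1 can be checked by brute force on small instances, using the fact quoted in the proof that the minimum distance of a linear code equals its minimum nonzero weight. The sketch below is an assumption-laden illustration: `min_distance` and the demonstration matrix `G_DEMO` (a hypothetical ternary code) are not taken from the paper.

```python
from itertools import product


def min_distance(G, p):
    """Minimum Hamming weight of msg * G over all nonzero messages in F_p^k.

    Returns 0 if the rows of G are linearly dependent, since some nonzero
    message then maps to the all-zero word.
    """
    k, n = len(G), len(G[0])
    best = n + 1
    for msg in product(range(p), repeat=k):
        if not any(msg):
            continue  # skip the all-zero message
        word = [sum(m * row[j] for m, row in zip(msg, G)) % p for j in range(n)]
        best = min(best, sum(1 for s in word if s))
    return best


def correctable_errors(G, p):
    """Largest delta with 2 * delta + 1 not exceeding the minimum distance."""
    return max(min_distance(G, p) - 1, 0) // 2


# Hypothetical [4, 2] ternary code used only for demonstration.
G_DEMO = [[1, 0, 1, 1],
          [0, 1, 1, 2]]
print(min_distance(G_DEMO, 3), correctable_errors(G_DEMO, 3))
```

Applied to each local receiving matrix $E_j$ of a candidate encoding matrix, the smallest value of `correctable_errors` over all clients gives the $\delta$ that the scheme supports in the sense of Lemma 1.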
Lemma 1 shows an important relation between the error-correction 
capability of a transmission scheme and the minimum 
distance of the linear codes generated by the corresponding 
local receiving matrices. The following lemma gives a 
method to determine the minimum distance of a linear code 
from its generating matrix. 

Lemma 2 Suppose $C$ is an $[n, k]$ linear code and $G$ is a 
generating matrix of $C$. Then the minimum distance of $C$ is 
at least $d$ if and only if any sub-matrix consisting of $n - d + 1$ 
columns of $G$ has rank $k$ (where $n - d + 1 \ge k$). 

Proof: Let $G_1, \cdots, G_n$ be the $n$ columns of $G$. Note that 
the minimum distance of a linear code equals its minimum 
weight. The minimum distance of $C$ is at least $d$ if and 
only if the minimum weight of $C$ is at least $d$. That is, for 
any $X \in \mathbb{F}^k \setminus \{0_k\}$, $\mathrm{wt}(XG) \ge d$. Clearly, this condition is 
equivalent to the following condition: 

(*): For any $X \in \mathbb{F}^k \setminus \{0_k\}$, the vector $XG$ has at most $n - d$ 
zero elements. 

For any $\{i_1, \cdots, i_{n-d+1}\} \subseteq \{1, \cdots, n\}$, consider the 
system of linear equations 

$$X (G_{i_1}, \cdots, G_{i_{n-d+1}}) = 0_{n-d+1} \tag{4}$$

where $X$ is a vector of $k$ variables. Note that Eq. (4) consists of 
$n - d + 1$ equations. Then condition (*) holds if and only 
if, for every such index set, no $X \in \mathbb{F}^k \setminus \{0_k\}$ is a solution of Eq. (4), 
which means that Eq. (4) has only the zero solution, i.e., 
$0_k$ is its only solution. By linear algebra, 
Eq. (4) has only the zero solution if and only if the submatrix 
$(G_{i_1}, \cdots, G_{i_{n-d+1}})$ has rank $k$. Thus, the minimum distance 
of $C$ is at least $d$ if and only if any submatrix consisting of $n - d + 1$ 
columns of $G$ has rank $k$. ■ 
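
The rank test of Lemma 2 avoids enumerating codewords: it only inspects column subsets. A minimal sketch over a prime field follows; the function names and the demonstration matrix `G_DEMO` are illustrative assumptions, and the modular inverse via `pow(a, -1, p)` assumes Python 3.8 or later.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of a matrix over the prime field F_p, by Gaussian elimination."""
    m = [[v % p for v in r] for r in rows]
    rank = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][c]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][c], -1, p)              # modular inverse of the pivot
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][c]:
                f = m[r][c]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def distance_at_least(G, d, p):
    """Lemma 2 test (for d >= 1): every choice of n - d + 1 columns has rank k."""
    k, n = len(G), len(G[0])
    width = n - d + 1
    if width < k:
        return False  # fewer than k columns can never reach rank k
    return all(rank_mod_p([[row[c] for c in cols] for row in G], p) == k
               for cols in combinations(range(n), width))


# Hypothetical [4, 2] ternary code used only for demonstration.
G_DEMO = [[1, 0, 1, 1],
          [0, 1, 1, 2]]
print([d for d in range(1, 5) if distance_at_least(G_DEMO, d, 3)])
```

The largest `d` for which the test passes is the minimum distance of the code generated by `G`.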

By Remark 1, designing a $\delta$-error correction solution of a 
CDE problem is equivalent to assigning a value in $\mathbb{F}$ to each 
variable $\xi_{i,j}$ in the incidence matrix such that the resulting local 
receiving matrices satisfy the condition of Lemma 2 (for their 
own parameters). In the following, we focus on the incidence 
matrix of the CDE problem $\mathcal{H} = (k, n, X, \mathcal{X})$. 

We use $\mathbb{F}[\xi_1, \cdots, \xi_N]$ to denote the polynomial ring of the 
variables $\xi_1, \cdots, \xi_N$ over the field $\mathbb{F}$. Let $r$ be a positive 
integer. An $r \times r$ matrix $L$ over the ring $\mathbb{F}[\xi_1, \cdots, \xi_N]$ is 
said to be non-singular if the determinant of $L$ is a nonzero 
polynomial in $\mathbb{F}[\xi_1, \cdots, \xi_N]$. 

Definition 5 Suppose $M$ is an $r \times l$ ($r \le l$) matrix over 
$\mathbb{F}[\xi_1, \cdots, \xi_N]$. The diameter of $M$ is defined as the smallest 
positive integer $\rho$ such that any $\rho$ columns of $M$ contain an 
$r \times r$ non-singular sub-matrix. 

For a given CDE problem $\mathcal{H} = (k, n, X, \mathcal{X})$, let $\rho_j$ be the 
diameter of the local incidence matrix of $r_j$, $j = 1, \cdots, n$. We 
define the diameter of $\mathcal{H}$ as $\rho = \max\{\rho_1, \cdots, \rho_n\}$. 

Reconsider the CDE problem in Example 2. It is easy to 
verify that $\rho_j = 4$, $\forall j \in \{1, \cdots, 6\}$. Thus the diameter of $\mathcal{H}$ 
in this example is $\rho = 4$. 
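
The diameter can be computed without symbolic algebra. The sketch below relies on a standard fact that is not stated in the paper: a square matrix whose nonzero entries are distinct indeterminates has a nonzero determinant polynomial exactly when its support admits a perfect matching between rows and columns. Under that assumption, a set of columns contains a non-singular sub-matrix of full order if and only if the maximum bipartite matching between those columns and the rows saturates every row. All function names are illustrative.

```python
from itertools import combinations

# Index sets A_j of Example 1 (1-based packet indices held by client r_j).
HELD = {1: {1, 3, 6}, 2: {2, 3, 4}, 3: {1, 2, 5},
        4: {3, 4, 5}, 5: {2, 4, 6}, 6: {1, 5, 6}}
K = 6


def matching_size(cols, adj):
    """Maximum bipartite matching between `cols` and rows (augmenting paths)."""
    owner = {}  # row -> column currently matched to it

    def augment(c, seen):
        for r in adj[c]:
            if r in seen:
                continue
            seen.add(r)
            if r not in owner or augment(owner[r], seen):
                owner[r] = c
                return True
        return False

    return sum(augment(c, set()) for c in cols)


def diameter(held, k, j):
    """Smallest p such that any p columns of the local incidence matrix of r_j
    contain a generically non-singular sub-matrix of full order."""
    rows = set(range(1, k + 1)) - held[j]
    adj = {c: held[c] & rows for c in held}  # rows in which column c has a variable
    for p in range(len(rows), len(held) + 1):
        if all(matching_size(cols, adj) == len(rows)
               for cols in combinations(sorted(held), p)):
            return p
    return None  # no column count suffices


rhos = [diameter(HELD, K, j) for j in sorted(HELD)]
rho = max(rhos)
print(rhos, rho, (len(HELD) - rho) // 2)
```

The last printed value is $\lfloor (n - \rho)/2 \rfloor$ for this instance.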



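Definition 6 For a CDE problem $\mathcal{H} = (k, n, X, \mathcal{X})$, let $\mathcal{L}_j = 
\{L ;\ L \text{ is a non-singular square sub-matrix of } C_j \text{ of order } |\bar{A}_j|\}$ 
and $\mathcal{L} = \bigcup_{j=1}^{n} \mathcal{L}_j$. We then define the character polynomial 
of $\mathcal{H}$ as the polynomial 

$$h(\cdots, \xi_{i,j}, \cdots) = \prod_{L \in \mathcal{L}} \det(L)$$

where $\det(L)$ is the determinant of the square matrix $L$. 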

The following lemma reduces the problem of designing a 
$\delta$-error correction solution of $\mathcal{H}$ to the problem of finding a 
nonzero point of the character polynomial of $\mathcal{H}$. 

Lemma 3 Let $h(\cdots, \xi_{i,j}, \cdots)$ be the character polynomial 
of $\mathcal{H}$ and $E = (a_{i,j})$ be an encoding matrix of $\mathcal{H}$ such that 
$h(\cdots, a_{i,j}, \cdots) \ne 0$. Then $E$ is a $\lfloor \frac{n-\rho}{2} \rfloor$-error correcting 
solution of $\mathcal{H}$, where $\rho$ is the diameter of $\mathcal{H}$. 

Proof: Let $C_j$ and $E_j$ be the local incidence matrix and 
the local receiving matrix of client $r_j$. Since $\rho$ is the diameter 
of $\mathcal{H}$, any set of $\rho$ columns of $C_j$ contains a non-singular 
sub-matrix $L$ of order $|\bar{A}_j|$, i.e., $L \in \mathcal{L}$. Correspondingly, 
any set of $\rho$ columns of $E_j$ contains a sub-matrix $L'$ such 
that $L'$ is obtained by replacing $\xi_{i,j}$ by $a_{i,j}$ in $L$. Since 
$h(\cdots, a_{i,j}, \cdots) \ne 0$, we have $\det(L') \ne 0$. Hence $L'$ has rank 
$|\bar{A}_j|$, and it follows that any set of $\rho$ columns of $E_j$ has rank 
$|\bar{A}_j|$. According to Lemma 2, $E_j$ is a generating matrix of an 
$[n, |\bar{A}_j|]$ linear code and its minimum distance satisfies $d \ge n - \rho + 1$. 
Let $\delta = \lfloor \frac{n-\rho}{2} \rfloor$. Then we have $2\delta \le n - \rho$. Thus, $2\delta + 1 \le d$. 

According to Lemma 1, $E$ is the encoding matrix of a 
$\lfloor \frac{n-\rho}{2} \rfloor$-error correcting solution of $\mathcal{H}$. ■ 
To proceed, we need the following lemma, 
which is a well-known result in algebra (e.g., see [11]). 

Lemma 4 Let $f(\xi_1, \cdots, \xi_N)$ be a nonzero polynomial in 
$\mathbb{F}[\xi_1, \cdots, \xi_N]$. For a sufficiently large field $\mathbb{F}$, there exists an 
$N$-tuple $(a_1, \cdots, a_N) \in \mathbb{F}^N$ such that $f(a_1, \cdots, a_N) \ne 0$. 

Now, we can prove our main result for deterministic coding. 

Theorem 1 Suppose $\mathbb{F}$ is sufficiently large. Then the CDE 
problem $\mathcal{H} = (k, n, X, \mathcal{X})$ has a $\delta$-error correcting solution 
if and only if $\delta \le \lfloor \frac{n-\rho}{2} \rfloor$, where $\rho$ is the diameter of $\mathcal{H}$. 

Proof: We first prove the sufficiency of the condition by 
assuming that $\delta \le \lfloor \frac{n-\rho}{2} \rfloor$. According to Lemma 4, there exists 
a feasible assignment $a_{i,j} \in \mathbb{F}$ for each $\xi_{i,j}$ such that 
$h(\cdots, a_{i,j}, \cdots) \ne 0$. By Lemma 3, $E = (a_{i,j})$ is the encoding 
matrix of a $\lfloor \frac{n-\rho}{2} \rfloor$-error correcting solution of $\mathcal{H}$. Since 
$\delta \le \lfloor \frac{n-\rho}{2} \rfloor$, $E$ is also the encoding matrix of a $\delta$-error correcting solution. 

We then prove the necessity of the condition, where we 
assume that $\mathcal{H}$ has a $\delta$-error correcting solution with encoding 
matrix $E = (a_{i,j})_{k \times n}$. By Lemma 1, each local receiving 
matrix $E_j$ is a generating matrix of an $[n, |\bar{A}_j|]$ linear code 
with minimum distance $d \ge 2\delta + 1$. By Lemma 2, any set 
of $n - d + 1$ columns of $E_j$ has rank $|\bar{A}_j|$, i.e., any set of 
$n - d + 1$ columns of $E_j$ contains a non-singular sub-matrix 
of order $|\bar{A}_j|$. Correspondingly, any set of $n - d + 1$ columns 
of $C_j$ contains a non-singular sub-matrix of order $|\bar{A}_j|$ over 
the ring $\mathbb{F}[\cdots, \xi_{i,j}, \cdots]$. We have $\rho_j \le n - d + 1$ and $\rho = 
\max\{\rho_1, \cdots, \rho_n\} \le n - d + 1$, where $\rho_j$ is the diameter of 
the local incidence matrix $C_j$ of $r_j$. Thus, $d \le n - \rho + 1$. 
Combining this with the afore-proven result that $d \ge 2\delta + 1$, we can 
deduce that $2\delta \le n - \rho$. Since $\delta$ is an integer, we have $\delta \le 
\lfloor \frac{n-\rho}{2} \rfloor$. This completes the proof of Theorem 1. ■ 

Consider the CDE problem $\mathcal{H} = (6, 6, X, \mathcal{X})$ in Example 1, 
for which we have shown that the diameter is 4. By Theorem 
1, $\mathcal{H}$ has a $\delta$-error correcting solution for any $\delta \le 1$, which 
means the system can tolerate at most one compromised client; 
otherwise, some clients may not be able to correctly deduce all 
the messages. It can also be easily verified that the encoding 
strategy given by the encoding matrix $E$ in Example 2 achieves 
the derived capability $\delta \le 1$. In other words, if any one 
client is compromised and sends a false message intentionally, 
the encoding strategy given by matrix $E$ can detect the false 
message and make sure all the clients can successfully decode 
their missing packets using their local receiving matrices. 

For example, suppose that one of $y_2, y_3, y_4, y_5, y_6$ in Example 2, 
say $y_2$, is changed to an erroneous value $y'_2$. Then 
$z_2$ also changes to an erroneous value $z'_2 = y'_2 - x_3$. 
Provided the code generated by $E_1$ has minimum distance 3, $(0, z_2, z_3, z_4, z_5, z_6)$ is 
the nearest codeword to $(0, z'_2, z_3, z_4, z_5, z_6)$, i.e., for any 
$(x'_2, x'_4, x'_5) \ne (x_2, x_4, x_5)$, $(x'_2, x'_4, x'_5) E_1$ has at least two 
elements different from $(0, z'_2, z_3, z_4, z_5, z_6)$. By the minimum 
distance decoder, we can still obtain the correct values of 
$x_2, x_4, x_5$. 

IV. Performance with Random Network Coding 

Although there exist feasible deterministic code designs 
that realize $\delta$-error correction, the deterministic encoding 
matrix must be defined and distributed across the network 
system beforehand. This not only incurs extra communication 
overhead, but also makes the system rather inflexible and 
unscalable, as any change in the network size, or in the 
individual packet sets possessed by the clients, will cause a 
re-computation and re-distribution of the entire coding scheme. 
To make the system more robust, scalable and hence more 
practical, we now consider using random linear network codes 
and evaluate their performance. 

In the distributed, random coding context, each client locally 
and independently generates an encoded packet from the packets in its 
possession, and broadcasts it to all of its peers. The coefficients of 
the encoding vector are randomly selected from a predefined 
field $\mathbb{F}$. Again, assuming there exist malicious clients, we 
are interested in computing the error tolerance capability 
of the system. Unlike the deterministic case, here the error 
tolerance must be evaluated over the ensemble of random 
coding schemes, assuming every instance is equally 
probable. The analytical result is therefore expressed as 
a probability. 

Before further analysis, we introduce the following 
Schwartz-Zippel Lemma (e.g., see [12]). 

Lemma 5 Let $f(\xi_1, \cdots, \xi_N)$ be a nonzero polynomial of 
degree $d \ge 0$ over a field $\mathbb{F}$. Let $S$ be a finite subset of 
$\mathbb{F}$, and let the value of each of $\xi_1, \cdots, \xi_N$ be selected independently 
and uniformly at random from $S$. Then the probability 
that the polynomial equals zero is at most $\frac{d}{|S|}$, i.e., 
$\Pr(f(a_1, \cdots, a_N) = 0) \le \frac{d}{|S|}$. 

We now prove our random coding result: 



Theorem 2 Suppose that the character polynomial of the 
CDE problem $\mathcal{H} = (k, n, X, \mathcal{X})$ is of degree $d$ and the size 
of the field $\mathbb{F}$ is $q > d$. Let the encoding coefficients $\{a_{i,j}\}$ be 
chosen independently and uniformly at random from $\mathbb{F}$. Then 
the probability that $E = (a_{i,j})$ is the encoding matrix of a 
$\lfloor \frac{n-\rho}{2} \rfloor$-error correcting solution of $\mathcal{H}$ is at least $1 - \frac{d}{q}$, where 
$\rho$ is the diameter of $\mathcal{H}$. 

Proof: Let $h(\cdots, \xi_{i,j}, \cdots)$ be the character polynomial 
of $\mathcal{H}$. According to Lemma 5, by randomly selecting each $a_{i,j}$ 
in the field $\mathbb{F}$, $\Pr(h(\cdots, a_{i,j}, \cdots) = 0) \le \frac{d}{q}$. Hence, 
$\Pr(h(\cdots, a_{i,j}, \cdots) \ne 0) \ge 1 - \frac{d}{q}$. From Lemma 3, the 
probability that $E = (a_{i,j})_{k \times n}$ is the encoding matrix of a 
$\lfloor \frac{n-\rho}{2} \rfloor$-error correcting solution of $\mathcal{H}$ is at least $1 - \frac{d}{q}$. ■ 

Remark 2: Clearly, the degree of the character polynomial 
of the CDE problem $\mathcal{H} = (k, n, X, \mathcal{X})$ depends only on the 
parameters $k$, $n$ and $\mathcal{X}$ and is independent of the field $\mathbb{F}$. By 
Theorem 2, if the field $\mathbb{F}$ is sufficiently large, then with high 
probability we can obtain a $\lfloor \frac{n-\rho}{2} \rfloor$-error correcting solution 
of $\mathcal{H}$ by randomly choosing the encoding coefficients from the 
given field. 
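
The behavior described in Remark 2 can be observed empirically. The following Monte Carlo sketch draws random encoding matrices with the zero pattern of Example 1 and tests, via the rank condition of Lemma 2, whether any $\rho = 4$ columns of every local receiving matrix have full row rank, which gives minimum distance at least $n - \rho + 1 = 3$ for every client. The prime field size `Q = 10007`, the trial count and the seed are hypothetical choices made only for illustration; the modular inverse via `pow(a, -1, p)` assumes Python 3.8 or later.

```python
import random
from itertools import combinations

# Index sets A_j of Example 1 (1-based packet indices held by client r_j).
HELD = {1: {1, 3, 6}, 2: {2, 3, 4}, 3: {1, 2, 5},
        4: {3, 4, 5}, 5: {2, 4, 6}, 6: {1, 5, 6}}
K, RHO = 6, 4   # number of packets; diameter stated in the text
Q = 10007       # hypothetical prime field size


def rank_mod_p(rows, p):
    """Rank of a matrix over the prime field F_p, by Gaussian elimination."""
    m = [[v % p for v in r] for r in rows]
    rank = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][c]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][c], -1, p)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][c]:
                f = m[r][c]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def is_solution(E, held, k, rho, p):
    """True if any rho columns of every local receiving matrix have full row rank."""
    n = len(held)
    for j in held:
        local = [E[i - 1] for i in range(1, k + 1) if i not in held[j]]
        for cols in combinations(range(n), rho):
            sub = [[row[c] for c in cols] for row in local]
            if rank_mod_p(sub, p) != len(local):
                return False
    return True


def success_rate(trials, seed=0):
    """Fraction of random encoding matrices that pass the rank test."""
    rng = random.Random(seed)
    clients = sorted(HELD)
    wins = 0
    for _ in range(trials):
        E = [[rng.randrange(Q) if i in HELD[j] else 0 for j in clients]
             for i in range(1, K + 1)]
        wins += is_solution(E, HELD, K, RHO, Q)
    return wins / trials


print(success_rate(200))
```

Repeating the experiment with a much smaller prime for `Q` typically lowers the observed fraction, in line with the $1 - \frac{d}{q}$ dependence in Theorem 2.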

V. Conclusion 
We have studied the error correction capability for a 
network coded data exchange problem. Assuming every client 
in the network is allowed to broadcast only one message, 
we develop a tight upper bound on the maximum number of clients 
that can be compromised or failed without undermining the 
final messages. We show that deterministic schemes exist that 
achieve the bound. For the system to be more scalable, we 
also consider random coding, and derive the probability that 
each client can successfully identify the erroneous message 
and deduce the complete information. It is worth remarking 
that since the encoding matrix is restricted (its zero pattern is 
fixed by the initial message distribution), the construction 
techniques of classical linear codes cannot be applied directly to CDE 
error correction codes. This gives rise to a new problem in 
code design. 